Structure clustering for Chinese patent documents
Authors
Abstract
This paper aims to cluster Chinese patent documents by their structure. Both the explicit and implicit structures are analyzed and represented with the proposed structure expression. An unsupervised clustering algorithm called the structured self-organizing map (SOM) is then adopted to cluster Chinese patent documents that are similar in both content and structure. Structured SOM clusters the similar content of each sub-part of the structure and then propagates the similarity to the upper levels. Experimental results showed that the computing time is proportional to the map size and the number of patents, which implies that the width and depth of the structure affect the performance of structured SOM. Structured clustering of patents is helpful in many applications. In a copyright lawsuit, companies can easily find conflicting claims in existing patents to rebut the accusation. Moreover, a company's decision-makers can be advised to avoid hotspot areas of patents, which can save a lot of R&D effort. © 2007 Elsevier Ltd. All rights reserved.
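The bottom-up propagation described in the abstract can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: each sub-part gets its own small SOM, the position of a document's best-matching unit on each child map is passed upward, and a parent SOM is trained on the concatenated child positions. The names `train_som`, `best_matching_unit`, and `structured_som`, the sub-part names, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3  # neighborhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-dist2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def structured_som(parts, rows=3, cols=3):
    """Cluster documents bottom-up (assumed scheme, not the paper's exact method).

    parts maps a sub-part name to a list holding one content vector per document.
    Returns the top-level map unit assigned to each document.
    """
    n_docs = len(next(iter(parts.values())))
    parent_input = [[] for _ in range(n_docs)]
    for name in sorted(parts):
        vectors = parts[name]
        som = train_som(vectors, rows, cols)
        for doc, x in enumerate(vectors):
            r, c = best_matching_unit(som, x)
            # Propagate the child-map position upward, scaled to [0, 1].
            parent_input[doc] += [r / max(rows - 1, 1), c / max(cols - 1, 1)]
    parent = train_som(parent_input, rows, cols)
    return [best_matching_unit(parent, x) for x in parent_input]


# Toy data: documents 0 and 1 are identical in every sub-part.
parts = {
    "claims": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    "description": [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]],
}
units = structured_som(parts)
```

In this sketch, documents with identical sub-parts always land on the same top-level unit, because their inputs are identical at every level. The training loop visits every map unit for every document in every epoch, so its cost grows with the map size and the number of documents, and one extra map is trained per sub-part.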
Similar resources
A Clustering Method of Highly Dimensional Patent Data Using Bayesian Approach
Patent data contain diverse technological information about any technology field, so many companies manage patent data to build their R&D policy. Patent analysis is one approach to patent management and is also an important tool for technology forecasting. Patent clustering is one of the tasks in patent analysis. In this paper, we propose an efficient clustering method ...
Metaheuristic clustering of Persian XML documents based on structural and content similarity
Due to the increasing number of XML documents, organizing these documents effectively in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...
White Paper Text Clustering on Patents
Overview: Amongst the various analyses performed on patents, the area where specialized software helps immensely is text mining. Two of the most popular text mining techniques used on patent data are text segmentation / tokenization and text clustering / topic identification. Text segmentation is a process of analyzing the patent text and identifying smaller meaningful segments from the text. T...
Abstract: Vacant Technology Forecasting based on Patent Analysis Using an Ensemble Method and Bayesian Clustering
Patent analysis is an important approach to technology forecasting because patents are an important component of developing technology. Also, we use the results of technology forecasting to build the R&D strategies efficiently. In this paper, we consider patent clustering as one of patent analyses. That is, we cluster patent documents in order to forecast the vacant area of a given technology f...
Building a Statistical Machine Translation System for Translating Patent Documents
This paper describes the work we conducted for building a statistical machine translation (SMT) system for the Chinese-English subtask of the NTCIR-9 patent MT evaluation. Our results show that most of the generic techniques we had developed for improving SMT performance work on patent data as well, and the changes we made to our SMT system training procedure in order to address special charact...
Journal title:
- Expert Syst. Appl.
Volume 34, issue
Pages -
Publication date 2008